
feat: add multimedia endpoint support (image, TTS, transcription, video) #101

Merged
jpr5 merged 7 commits into main from worktree-sharded-rolling-tide
Apr 10, 2026

@AlemTuzlak
Contributor

Summary

  • Add four new multimedia endpoint types: image generation (/v1/images/generations, /v1beta/models/{model}:predict), text-to-speech (/v1/audio/speech), audio transcription (/v1/audio/transcriptions), and video generation (/v1/videos, /v1/videos/{id})
  • Add match.endpoint field to FixtureMatch for isolating fixtures by endpoint type, preventing cross-matching (e.g., image fixtures won't match chat requests)
  • Add convenience methods (onImage, onSpeech, onTranscription, onVideo) on LLMock and backfill _endpointType on all existing handlers
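
The endpoint isolation described above can be pictured with a minimal sketch. The types below are simplified stand-ins, not the PR's actual definitions: the endpoint names other than "chat" and "embedding", the substring match on userMessage, and the response shape are all assumptions made for illustration.

```typescript
// Hypothetical, trimmed-down shapes; the real FixtureMatch has more fields.
type EndpointType = "chat" | "embedding" | "image" | "speech" | "transcription" | "video";

interface FixtureMatch {
  userMessage?: string;
  endpoint?: EndpointType;
}

interface Fixture {
  match: FixtureMatch;
  response: unknown;
}

interface MockRequest {
  userMessage: string;
  _endpointType: EndpointType;
}

// A fixture that declares an endpoint only matches requests of that type.
// Substring matching on userMessage is an assumption of this sketch.
function matchFixture(fixtures: Fixture[], req: MockRequest): Fixture | undefined {
  return fixtures.find((f) => {
    if (f.match.endpoint !== undefined && f.match.endpoint !== req._endpointType) {
      return false;
    }
    return f.match.userMessage === undefined || req.userMessage.includes(f.match.userMessage);
  });
}

const fixtures: Fixture[] = [
  {
    match: { userMessage: "guitar", endpoint: "image" },
    response: { images: [{ url: "https://example.invalid/guitar.png" }] },
  },
];

// The same prompt arrives once on the image route and once on the chat route.
const imageHit = matchFixture(fixtures, { userMessage: "a guitar on stage", _endpointType: "image" });
const chatHit = matchFixture(fixtures, { userMessage: "a guitar on stage", _endpointType: "chat" });
```

Here the image request finds the fixture while the chat request does not, which is the cross-matching prevention the summary refers to.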

New Endpoints

| Route | Method | Format | Match field |
|---|---|---|---|
| /v1/images/generations | POST | OpenAI | prompt → userMessage |
| /v1beta/models/{model}:predict | POST | Gemini Imagen | instances[0].prompt → userMessage |
| /v1/audio/speech | POST | OpenAI | input → userMessage |
| /v1/audio/transcriptions | POST | OpenAI (multipart) | match.endpoint only |
| /v1/videos | POST | OpenAI | prompt → userMessage |
| /v1/videos/{id} | GET | OpenAI | Stored video ID |

Test plan

  • Image generation: single, multiple, base64, Gemini Imagen format
  • TTS: correct Content-Type for mp3/opus, default format fallback
  • Transcription: simple JSON and verbose_json with words/segments
  • Video: create + status check, processing state, 404 for unknown ID
  • X-Test-Id isolation for image endpoint
  • Endpoint cross-matching prevention (image vs chat)
  • Convenience methods (onImage, onSpeech, onTranscription, onVideo)
  • Backfill: endpoint: "chat" and endpoint: "embedding" fixtures match existing handlers
  • Full suite: 2216 tests pass, 0 failures

@pkg-pr-new

pkg-pr-new Bot commented Apr 10, 2026

npm i https://pkg.pr.new/@copilotkit/aimock@101

commit: a76ea32

Contributor

@jpr5 left a comment

Code Review — Multimedia Endpoint Support

Well-structured PR. All 4 handlers follow consistent patterns, endpoint backfill is correct across all existing handlers, tests are strong (575 lines with specific assertions). One medium finding, two low.

Medium

Fixtures without endpoint match multimedia requests, then 500 at type guard (router.ts:44-48)

Endpoint filtering is one-directional: fixtures WITH endpoint are restricted to that endpoint, but fixtures WITHOUT endpoint match ANY request type. Consider a user with a generic chat fixture:

mock.addFixture({ match: { userMessage: "guitar" }, response: { content: "Chat about guitars" } });

This fixture also matches image requests for "guitar": handleImages accepts the match, then isImageResponse(response) fails and the handler returns a 500. The test only verifies the reverse direction (an image fixture doesn't match chat requests).

Fix: when a request has _endpointType and the matched fixture has no endpoint, verify the response type is compatible with the endpoint before returning the match. Or make filtering bidirectional.
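
A minimal sketch of the first option (checking response compatibility for fixtures that declare no endpoint) might look like the following. The guard bodies, the response shapes, and the matchFixture signature are assumptions for illustration; only the names isImageResponse and matchFixture come from the PR, and their real implementations may differ.

```typescript
type EndpointType = "chat" | "image";

interface Fixture {
  match: { userMessage?: string; endpoint?: EndpointType };
  response: Record<string, unknown>;
}

// Stand-in guards; the real isImageResponse in the PR may check different fields.
function isImageResponse(r: Record<string, unknown>): boolean {
  return Array.isArray(r["images"]);
}
function isChatResponse(r: Record<string, unknown>): boolean {
  return typeof r["content"] === "string";
}

const compatible: Record<EndpointType, (r: Record<string, unknown>) => boolean> = {
  chat: isChatResponse,
  image: isImageResponse,
};

function matchFixture(
  fixtures: Fixture[],
  userMessage: string,
  endpointType: EndpointType,
): Fixture | undefined {
  return fixtures.find((f) => {
    if (f.match.userMessage !== undefined && !userMessage.includes(f.match.userMessage)) {
      return false;
    }
    // Fixtures that declare an endpoint keep the existing strict behavior.
    if (f.match.endpoint !== undefined) {
      return f.match.endpoint === endpointType;
    }
    // No endpoint declared: accept only if the response shape fits the request type.
    return compatible[endpointType](f.response);
  });
}

// The generic chat fixture from the finding above.
const generic: Fixture[] = [
  { match: { userMessage: "guitar" }, response: { content: "Chat about guitars" } },
];

const imageAttempt = matchFixture(generic, "guitar", "image");
const chatAttempt = matchFixture(generic, "guitar", "chat");
```

With this check, the image request falls through to a "no fixture matched" path instead of reaching the type guard with an incompatible response, while existing chat fixtures without an endpoint keep working.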

Low

extractFormField runs a regex over binary multipart data (transcription.ts:15-22): readBody converts the binary body to a UTF-8 string. If the file part appears before the text fields, mangled bytes could theoretically match the regex. This is extremely unlikely with real audio, but fragile. A boundary-delimited parser would be more robust.
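
As a rough illustration of what a boundary-delimited approach could look like, the sketch below splits the raw bytes on the multipart boundary and only decodes the headers and the bodies of non-file parts. It assumes the handler has access to the undecoded body as a Uint8Array and to the boundary string from the Content-Type header; extractTextFields and indexOfBytes are hypothetical names, not functions from the PR.

```typescript
// Find the first index of `needle` in `haystack` at or after `from`, or -1.
function indexOfBytes(haystack: Uint8Array, needle: Uint8Array, from = 0): number {
  outer: for (let i = from; i <= haystack.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Collect the text (non-file) fields of a multipart/form-data body.
function extractTextFields(body: Uint8Array, boundary: string): Map<string, string> {
  const enc = new TextEncoder();
  const dec = new TextDecoder();
  const delim = enc.encode(`--${boundary}`);
  const headerEnd = enc.encode("\r\n\r\n");
  const fields = new Map<string, string>();

  let pos = indexOfBytes(body, delim);
  while (pos !== -1) {
    const partStart = pos + delim.length;
    const next = indexOfBytes(body, delim, partStart);
    if (next === -1) break; // reached the closing delimiter; no further parts

    const part = body.subarray(partStart, next);
    const split = indexOfBytes(part, headerEnd);
    if (split !== -1) {
      // Only the header block is decoded for every part, never the file bytes.
      const headers = dec.decode(part.subarray(0, split));
      const name = /[;\s]name="([^"]*)"/i.exec(headers)?.[1];
      const isFile = /filename="/i.test(headers);
      if (name !== undefined && !isFile) {
        // Drop the CRLF that precedes the next delimiter.
        const value = part.subarray(split + headerEnd.length, part.length - 2);
        fields.set(name, dec.decode(value));
      }
    }
    pos = next;
  }
  return fields;
}
```

Because file contents are skipped as opaque bytes, field-like text inside the audio payload cannot be mistaken for a form field, regardless of part order.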

_endpointType not a declared field (types.ts) — stored via index signature, no type safety. Adding _endpointType?: string to ChatCompletionRequest would catch typos.
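
A small sketch of the suggested declaration follows. The request type here is a trimmed-down stand-in for the real ChatCompletionRequest, and it narrows the field to a union rather than string, which is a stricter variant than the one proposed above; the union members and the tagRequest helper are assumptions for illustration.

```typescript
// Assumed endpoint names; only "chat" and "embedding" appear in the PR text.
type EndpointType = "chat" | "embedding" | "image" | "speech" | "transcription" | "video";

// Hypothetical, trimmed-down request type with an index signature like the one described.
interface ChatCompletionRequest {
  model: string;
  _endpointType?: EndpointType; // declared explicitly instead of relying on the index signature
  [key: string]: unknown;
}

// Hypothetical helper that a handler could use to tag an incoming request.
function tagRequest(req: ChatCompletionRequest, endpoint: EndpointType): ChatCompletionRequest {
  return { ...req, _endpointType: endpoint };
}

const tagged = tagRequest({ model: "hypothetical-model" }, "image");
// tagged._endpointType is typed as EndpointType | undefined rather than unknown,
// so a misspelled value such as "imgae" is rejected at compile time.
```

Note that while an index signature is present, a misspelled property name can still resolve through it unless the compiler option noPropertyAccessFromIndexSignature is enabled, so the declaration mainly improves the type of reads and writes to the correctly spelled field.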

Clean

  • Image gen (OpenAI + Gemini Imagen), TTS, transcription, video create/poll all correct
  • matchFixture endpoint filtering works for the designed direction
  • Convenience methods (onImage, onSpeech, etc.) wire correctly
  • Video state map with X-Test-Id isolation is correct
  • Backfill of _endpointType on all existing handlers is consistent

🤖 Reviewed with Claude Code

@jpr5 force-pushed the worktree-sharded-rolling-tide branch 3 times, most recently from 785a371 to 8541b42 (April 10, 2026 21:33)
@jpr5 force-pushed the worktree-sharded-rolling-tide branch from 8541b42 to a76ea32 (April 10, 2026 21:35)
@jpr5 merged commit 752966a into main (Apr 10, 2026)
22 checks passed
@jpr5 deleted the worktree-sharded-rolling-tide branch (April 10, 2026 21:37)

2 participants